images: speed up lists #7215
I don't know if I like making this part of the signature - asking someone
writing an app against Libpod to know what to do here (and to properly
handle cases where images are created and removed from under us) doesn't
seem right. Maybe a cache around getting images from the store?
…On Tue, Aug 4, 2020, 08:43 OpenShift CI Robot ***@***.***> wrote:
[APPROVALNOTIFIER] This PR is *APPROVED*
This pull-request has been approved by: *vrothberg*
The full list of commands accepted by this bot can be found here
<http://?repo=containers%2Fpodman>.
The pull request process is described here
<https://git.k8s.io/community/contributors/guide/owners.md#the-code-review-process>
Needs approval from an approver in each of these files:
- OWNERS <https://github.com/containers/podman/blob/master/OWNERS>
[vrothberg]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
Caching is very hard to get right for long-running processes as we need to invalidate the caches at some point. Using a variadic func was the lazy alternative to adding new functions with an |
@vrothberg That would be a good compromise. |
9db9ecd
to
1252f04
Compare
lgtm |
Let's wait for @mtrmac's head nod before merging. The performance diff is almost too good to be true/correct |
- AFAICS this is not quite the same thing: `IsParent` matches images that have the same top layer as a “child” but have one fewer history entry than the child.
- If you are worried about time complexity, the iteration over `storageLayers` is still O(layers) ≥ O(images) in every `isIntermediate` call, making this O(images^2); instead, you can build a `map[parentLayerId]hasChild` in O(layers) and check that for all images in O(images).
- Does `pkg/api/handlers/utils.GetImages` need the same change?
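The second point above can be sketched as follows. This is only an illustration under stated assumptions: the `layer` and `image` types and both function names are hypothetical stand-ins, not the containers/storage or libpod types, and a top layer having a child layer only marks an image as a candidate; it does not by itself decide whether the image is intermediate.

```go
package main

import "fmt"

// layer and image are hypothetical stand-ins for the real storage types.
type layer struct {
	ID     string
	Parent string // empty for a base layer
}

type image struct {
	ID       string
	TopLayer string
}

// layersWithChildren makes a single pass over all layers (O(layers)) and
// records which layer IDs are the parent of at least one other layer.
func layersWithChildren(layers []layer) map[string]bool {
	hasChild := make(map[string]bool, len(layers))
	for _, l := range layers {
		if l.Parent != "" {
			hasChild[l.Parent] = true
		}
	}
	return hasChild
}

// imagesWithChildLayers makes a single pass over all images (O(images)) and
// returns the IDs of images whose top layer has a child layer.
func imagesWithChildLayers(images []image, hasChild map[string]bool) []string {
	var ids []string
	for _, img := range images {
		if hasChild[img.TopLayer] {
			ids = append(ids, img.ID)
		}
	}
	return ids
}

func main() {
	layers := []layer{{ID: "l1"}, {ID: "l2", Parent: "l1"}, {ID: "l3", Parent: "l2"}}
	images := []image{{ID: "base", TopLayer: "l1"}, {ID: "app", TopLayer: "l3"}}
	fmt.Println(imagesWithChildLayers(images, layersWithChildren(layers)))
}
```

The map is built once and then consulted per image, so the total cost is O(layers + images) rather than one layer scan per image.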
Aesthetically, I’m not too much of a fan of adding a separate “quick fix” instead of making the primary parent/child determination code fast. It should AFAICS be possible to build a full parent/child tree in something close to O(images+layers), without changing the semantics, by
- building a layer tree
- building some kind of “history ID” (sha256sum of sha256sums of (the relevant fields of?) individual history entries?), for the full history of each image, and for the history of each image without its last entry
- building an image tree on top of these history IDs
- intersecting the layer and history trees, to get the same semantics as before
Some callers only need a single parent/child response, and those would not be (asymptotically) slower than the current code; the other callers like “remove” and “list” would fairly strongly benefit.
`GetLayersMapWithImageInfo` is a starting point but does not compute exactly the same thing (i.o.w. `podman image tree` and `podman images list` show different trees!).
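A minimal sketch of the “history ID” idea from the list above, assuming each history entry has already been reduced to a string of its relevant fields. The functions `historyID` and `parentHistoryID` are hypothetical names for illustration; nothing here is taken from the Podman code base.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// historyID hashes the SHA-256 digests of the individual history entries.
// Each inner digest has a fixed size, so concatenating them is unambiguous.
func historyID(entries []string) string {
	outer := sha256.New()
	for _, e := range entries {
		sum := sha256.Sum256([]byte(e))
		outer.Write(sum[:]) // hash.Hash.Write never returns an error
	}
	return hex.EncodeToString(outer.Sum(nil))
}

// parentHistoryID returns the ID of the history without its last entry.
// The boolean is false when there is no entry to drop.
func parentHistoryID(entries []string) (string, bool) {
	if len(entries) == 0 {
		return "", false
	}
	return historyID(entries[:len(entries)-1]), true
}

func main() {
	parent := []string{"FROM scratch", "ADD rootfs /"}
	child := []string{"FROM scratch", "ADD rootfs /", "RUN make"}

	// A child candidate is an image whose truncated history ID equals the
	// full history ID of the parent.
	pid, _ := parentHistoryID(child)
	fmt.Println(pid == historyID(parent))
}
```

With both IDs stored per image, parent lookups become map lookups keyed by history ID, which can then be intersected with the layer tree as the list above suggests.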
1252f04
to
14c126b
Compare
1067f29
to
2b345b1
Compare
LGTM, would like a head nod from @nalind |
The `areParentAndChild` checks are not quite O(images), but I suspect this gets us 99% there in practice, a nice improvement already.
// Now assign the images to each (top) layer.
for i := range images {
	img := images[i] // do not leak loop variable outside the scope
Does this actually make a difference? There isn’t a closure that would survive the loop iteration.
I was pretty sure that we need it for `node.images = append(node.images, img)`, don't we?
I don’t think so; that line reads the contents of the `img` variable at that point and appends the pointer to `node.images`. If `img` changes later, that doesn’t affect `node.images`.
The copy/new scope is only necessary when the evaluation of a variable is deferred after the iteration terminates, e.g. `runOneSecondLater(func() { consume(img) })`.
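The distinction can be illustrated with a small sketch. It deliberately uses one variable declared outside the loop so that the shared-variable behaviour is the same on every Go version (since Go 1.22, variables declared by the `for` statement itself are per-iteration). The `image` type and all function names are hypothetical.

```go
package main

import "fmt"

type image struct{ id string }

// appendDirect reads the shared variable immediately, so each appended
// pointer is the value img held during that iteration.
func appendDirect(images []*image) []string {
	var collected []*image
	var img *image // one variable reused by every iteration
	for i := range images {
		img = images[i]
		collected = append(collected, img)
	}
	ids := make([]string, 0, len(collected))
	for _, c := range collected {
		ids = append(ids, c.id)
	}
	return ids
}

// deferredShared captures the shared variable in closures that run after
// the loop, so every closure sees the last value assigned to img.
func deferredShared(images []*image) []string {
	var funcs []func() string
	var img *image
	for i := range images {
		img = images[i]
		funcs = append(funcs, func() string { return img.id })
	}
	return call(funcs)
}

// deferredCopied declares a fresh variable per iteration, so each closure
// keeps its own image.
func deferredCopied(images []*image) []string {
	var funcs []func() string
	for i := range images {
		img := images[i]
		funcs = append(funcs, func() string { return img.id })
	}
	return call(funcs)
}

func call(funcs []func() string) []string {
	out := make([]string, 0, len(funcs))
	for _, f := range funcs {
		out = append(out, f())
	}
	return out
}

func main() {
	images := []*image{{"a"}, {"b"}, {"c"}}
	fmt.Println(appendDirect(images))   // [a b c]
	fmt.Println(deferredShared(images)) // [c c c]
	fmt.Println(deferredCopied(images)) // [a b c]
}
```

Only the deferred case changes its result when the copy is dropped, which matches the point made above about `append`.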
4c92ca4
to
9d15e5a
Compare
Listing images has shown increasing performance penalties with an increasing number of images. Unless `--all` is specified, Podman will filter intermediate images. Determining intermediate images has been done by finding (and comparing!) parent images, which is expensive. We had to query the storage many times, which turned it into a bottleneck.

Instead, create a layer tree and assign one or more images to nodes that match the images' top layer. Determining the children of an image is now considerably faster as we already know the child images from the layer graph and the images using the same top layer, which may also be considered child images based on their history.

On my system with 510 images, a rootful image list drops from 6 secs down to 0.3 secs.

Also use the tree to compute parent nodes, and to filter intermediate images for pruning.

Signed-off-by: Valentin Rothberg <[email protected]>
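The layer-tree approach described in the commit message can be sketched roughly as below. This is not the code from the PR: the `layer`, `image`, and `node` types and the functions `buildTree` and `candidateChildren` are hypothetical, and the history comparison that decides whether a candidate really is a child is left out.

```go
package main

import "fmt"

// layer and image are hypothetical stand-ins for the storage types.
type layer struct{ ID, Parent string }
type image struct{ ID, TopLayer string }

// node is one layer in the tree plus the images whose top layer it is.
type node struct {
	layer    layer
	parent   *node
	children []*node
	images   []*image
}

// buildTree creates one node per layer, links parents and children, and
// assigns each image to the node of its top layer: O(layers + images).
func buildTree(layers []layer, images []image) map[string]*node {
	nodes := make(map[string]*node, len(layers))
	for _, l := range layers {
		nodes[l.ID] = &node{layer: l}
	}
	for _, l := range layers {
		if p, ok := nodes[l.Parent]; ok {
			n := nodes[l.ID]
			n.parent = p
			p.children = append(p.children, n)
		}
	}
	for i := range images {
		img := &images[i]
		if n, ok := nodes[img.TopLayer]; ok {
			n.images = append(n.images, img)
		}
	}
	return nodes
}

// candidateChildren returns images that share the parent's top layer or sit
// on a direct child layer. A real implementation would still compare
// histories before treating a candidate as a child.
func candidateChildren(nodes map[string]*node, parent image) []string {
	n, ok := nodes[parent.TopLayer]
	if !ok {
		return nil
	}
	var ids []string
	for _, other := range n.images {
		if other.ID != parent.ID {
			ids = append(ids, other.ID)
		}
	}
	for _, c := range n.children {
		for _, ci := range c.images {
			ids = append(ids, ci.ID)
		}
	}
	return ids
}

func main() {
	layers := []layer{{ID: "l1"}, {ID: "l2", Parent: "l1"}}
	images := []image{
		{ID: "base", TopLayer: "l1"},
		{ID: "base-relabel", TopLayer: "l1"},
		{ID: "app", TopLayer: "l2"},
	}
	tree := buildTree(layers, images)
	fmt.Println(candidateChildren(tree, images[0]))
}
```

Because the tree is built once per listing, each child lookup only touches one node and its direct children instead of querying the storage for every image pair.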
9d15e5a
to
8827100
Compare
Updated. Thanks for the great review! @mtrmac, I created a Jira card to perform some follow-up work to wire the layer tree into more APIs (and to also have a closer look at `pkg/api/handlers`).
LGTM.
foundChildren = true
children = append(children, childImage.ID())
if all {
	return foundChildren, nil
(Absolutely non-blocking: A `break` would be a tiny bit easier to follow for me.)
/lgtm |
/hold cancel I can fix it with the follow-up work 👍 |
@mheon, I'd love to get this into the v2.0 branch. Cool with it? |